Block-Row Sparse Matrix-Vector Multiplication on SIMD Machines

Authors

  • Nirav H. Kapadia
  • José A. B. Fortes
Abstract

The irregular nature of the data structures required to efficiently store arbitrary sparse matrices and the architectural constraints of a SIMD computer make it difficult to design an algorithm that can efficiently multiply an arbitrary sparse matrix by a vector. A new "block-row" algorithm is proposed. It allows the "regularity" of a data structure with a row-major mapping to be varied by changing a parameter (the "blocksize"); a heuristic to find a very good approximation of the optimal blocksize is also described. The block-row algorithm has been implemented on a 16,384-processor MasPar MP-1, and, for the matrices studied, the algorithm was found to be faster than any of the other algorithms considered.
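The abstract does not give the details of the block-row algorithm, so the following is only a generic sketch of the trade-off it names, not the authors' method. It assumes a row-major layout in which each row's nonzeros are cut into fixed-size blocks padded with zeros. The names `to_row_blocks`, `spmv_blocks`, and `padding_fraction` are hypothetical. A larger blocksize gives fewer, more uniform blocks but stores more padding.

```python
# Illustrative only: NOT the block-row algorithm from the paper.
# Assumption: "regularity" is modelled as fixed-size, zero-padded blocks.

def to_row_blocks(rows, blocksize):
    """Cut each row's nonzeros into blocks of exactly `blocksize` entries.

    rows: list of rows, each a list of (column, value) pairs.
    Returns a list of (row_index, columns, values) tuples.
    """
    blocks = []
    for i, row in enumerate(rows):
        for start in range(0, len(row), blocksize):
            chunk = row[start:start + blocksize]
            pad = blocksize - len(chunk)
            # Padding entries point at column 0 with value 0.0, so they
            # contribute nothing to the product.
            cols = [c for c, _ in chunk] + [0] * pad
            vals = [v for _, v in chunk] + [0.0] * pad
            blocks.append((i, cols, vals))
    return blocks


def spmv_blocks(blocks, x, n_rows):
    """Compute y = A x from the padded blocks (x must be non-empty)."""
    y = [0.0] * n_rows
    for i, cols, vals in blocks:
        # Every block does the same amount of work, padding included.
        y[i] += sum(v * x[c] for c, v in zip(cols, vals))
    return y


def padding_fraction(rows, blocksize):
    """Fraction of stored entries that are padding rather than nonzeros."""
    nnz = sum(len(r) for r in rows)
    stored = len(to_row_blocks(rows, blocksize)) * blocksize
    return (stored - nnz) / stored if stored else 0.0


# A small 3x3 example with 2, 1, and 3 nonzeros per row.
rows = [
    [(0, 2.0), (2, 1.0)],
    [(1, 3.0)],
    [(0, 1.0), (1, 1.0), (2, 1.0)],
]
x = [1.0, 2.0, 3.0]
y = spmv_blocks(to_row_blocks(rows, 2), x, len(rows))
```

In this sketch the result is the same for any blocksize; only the number of blocks and the amount of padding change. That is the kind of quantity a blocksize heuristic would have to balance.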


Similar resources

A SIMD Sparse Matrix-Vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models

Kapadia, Nirav Harish. M.S.E.E., Purdue University. May 1994. A SIMD Sparse Matrix-Vector Multiplication Algorithm for Computational Electromagnetics and Scattering Matrix Models. Major Professor: Jose Fortes. A large number of problems in numerical analysis require the multiplication of a sparse matrix by a vector. In spite of the large amount of fine-grained parallelism available in the proc...


Run-Time Optimization of Sparse Matrix-Vector Multiplication on SIMD Machines

Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-vector product is computed repeatedly, often thousands of times, with updated values of the vector until convergence is achieved. In an SIMD architecture, each processor has to fetch the updated off-processor vector el...


Run-Time Optimization of Sparse Matrix-Vector Multiplication on SIMD

Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computations (e.g., finite element methods). In such solvers, the matrix-vector product is computed repeatedly, often thousands of times, with updated values of the vector until convergence is achieved. In an SIMD architecture, each processor has to fetch the updated off-processor vector elemen...


Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...


SIMD Parallel Sparse Matrix-Vector and Transposed-Matrix-Vector Multiplication in DD Precision

We accelerate a double precision sparse matrix and DD vector multiplication (DD-SpMV), and its transposition and DD vector multiplication (DD-TSpMV) by using SIMD AVX2 for Krylov subspace methods. We compare some storage formats of DD-SpMV and DD-TSpMV for AVX2 to eliminate performance degradation factors in CRS. Our experience indicates that BCRS4x1, with fitting block size to the SIMD register...



Publication date: 1995